Researching COVID to enhance recovery (RECOVER) pediatric study protocol: Rationale, objectives and design

Importance
The prevalence, pathophysiology, and long-term outcomes of COVID-19 (post-acute sequelae of SARS-CoV-2 [PASC] or “Long COVID”) in children and young adults remain unknown. Studies must address the urgent need to define PASC, its mechanisms, and potential treatment targets in children and young adults.
Observations
We describe the protocol for the Pediatric Observational Cohort Study of the NIH’s REsearching COVID to Enhance Recovery (RECOVER) Initiative. RECOVER-Pediatrics is an observational meta-cohort study of caregiver-child pairs (birth through 17 years) and young adults (18 through 25 years), recruited from more than 100 sites across the US. This report focuses on two of four cohorts that comprise RECOVER-Pediatrics: 1) a de novo RECOVER prospective cohort of children and young adults with and without previous or current infection; and 2) an extant cohort derived from the Adolescent Brain Cognitive Development (ABCD) study (n = 10,000). The de novo cohort incorporates three tiers of data collection: 1) remote baseline assessments (Tier 1, n = 6000); 2) longitudinal follow-up for up to 4 years (Tier 2, n = 6000); and 3) a subset of participants, primarily the most severely affected by PASC, who will undergo deep phenotyping to explore PASC pathophysiology (Tier 3, n = 600). Youth enrolled in the ABCD study participate in Tier 1. The pediatric protocol was developed as a collaborative partnership of investigators, patients, researchers, clinicians, community partners, and federal partners, intentionally promoting inclusivity and diversity. The protocol is adaptive to facilitate responses to emerging science.
Conclusions and relevance
RECOVER-Pediatrics seeks to characterize the clinical course, underlying mechanisms, and long-term effects of PASC from birth through 25 years old. RECOVER-Pediatrics is designed to elucidate the epidemiology, four-year clinical course, and sociodemographic correlates of pediatric PASC. The data and biosamples will allow examination of mechanistic hypotheses and biomarkers, thus providing insights into potential therapeutic interventions.
ClinicalTrials.gov identifier
Clinical Trial Registration: http://www.clinicaltrials.gov. Unique identifier: NCT05172011.



Introduction
Long COVID, or the post-acute sequelae of SARS-CoV-2 (PASC), has been defined as symptoms, signs and conditions that continue or develop after a SARS-CoV-2 infection. These symptoms can affect people for weeks, months or even years after getting coronavirus disease 2019 (COVID-19) [1,2]. Symptoms can develop shortly after the initial recovery from an acute COVID-19 episode or persist from the initial illness. Symptoms may also emerge later or fluctuate or relapse over time. These symptoms can have debilitating effects on the daily health and quality of life of those affected.
The COVID-19 pandemic has significantly impacted child health. Nearly 100 million people have been diagnosed with COVID-19 in the United States (US), including nearly 16 million children [3]. Although it is estimated that between 10% and 30% of adults experience persistent symptoms from COVID-19 [4], the prevalence in children is less well-established [5,6]. Because PASC is an emerging illness, the absence of universally accepted PASC definitions in children challenges the elucidation of its epidemiology.
Unique challenges in understanding PASC symptoms in children have likely contributed to the limited evidence. For example, young children might not be able to articulate their symptoms. This has required studies to rely on caregiver interpretation of their young child's symptoms. In addition, manifestation of symptoms may vary substantively across stages of physiological, emotional, and cognitive development [7]. As the medical community shifts from managing serious acute disease to addressing long-term consequences, large-scale studies are needed to define PASC in children across the life course, to understand its natural history, and to develop evidence to guide successful treatment.
The pandemic began with a misconception that children were spared [8]. We now recognize that children and families are greatly impacted during both acute and chronic phases [9-12]. One distinct manifestation in children was recognized in April 2020 and is now called Multisystem Inflammatory Syndrome in Children (MIS-C) [13]. This debilitating hyperinflammatory syndrome has impacted over 9,000 children and young adults in the US [14], and represents a distinct post-acute syndrome that is typically recognizable in clinical practice. Other more chronic manifestations of PASC are challenging to characterize and identify. Furthermore, children with PASC may present with different symptoms and greater mental health concerns than adults [3,15-21]. Additional phenotypes of childhood PASC are being reported, including phenotypes similar to postural orthostatic tachycardia syndrome (POTS), myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), post-intensive care unit syndrome, and potentially many others [22-24]. Therefore, a compelling rationale exists to invest resources and effort to study PASC in children. The NIH's REsearching COVID to Enhance Recovery (RECOVER) Initiative responded by bringing together researchers, communities, and families in a systematic study of PASC in children [25]. Evidence that leads to improved health trajectories of children with PASC could have population-level health impacts for decades to come.

Study rationale
RECOVER has established the Pediatric Observational Cohort Study (RECOVER-Pediatrics), which is a combined retrospective and prospective longitudinal study, including four distinct cohorts, integrated as a meta-cohort [25]. The overall goal is to characterize the clinical course, underlying mechanisms and long-term health effects of PASC on children and young adults from birth through 25 years old, to inform future pediatric preventive and treatment measures.

Study aims
RECOVER-Pediatrics scientific aims are to:
1. Characterize the prevalence and incidence of new onset or worsening symptoms related to PASC.
2. Characterize the spectrum of clinical symptoms of PASC, including distinct phenotypes, and describe the clinical course and recovery.
3. Identify risk and resiliency factors for developing PASC and recovering from PASC.
4. Define the pathophysiology of PASC, including subclinical organ dysfunction, and identify biological mechanisms underlying the pathogenesis of PASC.

Overview of study design
RECOVER-Pediatrics is a longitudinal, observational meta-cohort study of children and young adults (ages birth through 25 years) and their caregivers, recruited from healthcare- and community-based settings in more than 100 sites throughout the US, including Puerto Rico.
Those with and without a history of a SARS-CoV-2 infection are included. For those 17 years or younger, data are collected by caregiver report and child direct assessments, and for those 18 through 25 years old by self-report. The study is being conducted from March 2022 to March 2026.
The pediatric meta-cohort is comprised of four distinct cohorts: 1) de novo RECOVER prospective cohort including children and young adults ages birth through 25 years, with or without a known history of infection, and their caregivers; 2) Adolescent Brain Cognitive Development (ABCD) extant cohort, the largest long-term US study of brain development in adolescence [26,27]; 3) In utero exposure cohort, including children less than 3 years old born to individuals with and without a SARS-CoV-2 infection during pregnancy [28,29]; and 4) COVID MUSIC Study extant cohort (Long-Term Outcomes after the Multisystem Inflammatory Syndrome In Children), including children and young adults with history of MIS-C [30]. This report focuses on the de novo cohort and ABCD (Fig 1).
Fig 1 shows a tiered overview of 2 of the 4 cohorts included in the meta-cohort (de novo RECOVER prospective cohort and ABCD), their participation in the three study tiers, and their targeted sample sizes (see Study Participants). Children and young adults ages newborn through 25 years old will be enrolled in the meta-cohort at Tier 1 for the de novo RECOVER prospective cohort (more than 6,000 from birth through 25 years old, including those with and without history of infection), and from ABCD (up to 10,000 adolescents with and without history of infection). All children and young adults enrolled in the study complete a baseline assessment (Tier 1). Percentages shown indicate random sampling proportions. Children and young adults without history of infection are assigned at random with prespecified proportions to the acute and post-acute arms. All children and young adults with history of infection who enroll into the acute arm and those without a history of infection who are randomized to the acute arm are asked to complete assessments at 2, 4, and 8 weeks. Following a promotion algorithm, children and young adults in Tier 1 will be selected to be promoted to Tier 2, which includes assessments at 2-6, 12, 24, 36, and 48 months after enrollment. 600 children and young adults with history of infection, selected from Tier 2, will complete more intensive Tier 3 assessments at 12 and 24 months after enrollment.
*Children and young adults with history of infection who enroll in the post-acute arm ("post-acute infected", n = 4,000) are stratified into High, Medium, and Low probability of PASC groups based on a combination of past Long COVID diagnoses, Tier 1 Global PROMIS health measures, and symptom survey screener responses. Then, 100% of the high probability group, 50% of the medium probability group, and 20% of the low probability group are promoted at random to Tier 2. In June 2023, the promotion rate for the medium probability group was increased to 100% to enhance promotion rates and include the wider spectrum of symptom severity within our longitudinal cohort. Since the distribution of these probability groups is unknown a priori, sample sizes are not specified for each category. Overall, the number of children and young adults who progress to Tier 2 will be less than the initial post-acute infected sample size, but the total target sample size for infected children and young adults in Tier 2 is 5,400.
** In order to achieve a sample of 5,400 children and young adults with history of infection in Tier 2 that is skewed towards those with greater likelihood of having PASC, additional children and young adults will be recruited from Long COVID clinics and subspecialty services to complete both Tier 1 and Tier 2 assessments.
RECOVER-Pediatrics is structured in a sequential fashion with three Tiers of data collection. Participants are enrolled initially into Tier 1, which consists of a broad screening of health using remote surveys and biospecimen collection. Participants may subsequently progress to Tier 2, which includes a detailed review of health collected longitudinally for up to four years, using a combination of remote surveys and in-person assessments of biological and psychosocial data. In order to achieve the sample size required for Tier 2 assessments, other study participants will be recruited who present with a high probability of having PASC, such as those directly recruited from a clinic that focuses on Long COVID or presenting with a physician diagnosis of PASC. These participants will receive Tier 1 assessments and progress directly to Tier 2. Finally, in Tier 3, a subset of children and young adults most severely affected by PASC will undergo deep phenotyping with more intensive assessments to study PASC pathophysiology.
RECOVER-Pediatrics Tier 1 assessments aim to characterize the prevalence and incidence of new onset or worsening of sustained COVID-related symptoms (aim 1) and to gain a comprehensive understanding of the impact of exposure to a SARS-CoV-2 infection on broad physical, behavioral and mental health (aim 2). Tier 2 facilitates studying the natural history of PASC symptoms and potential recovery over time (aim 2). Child, household, and caregiver factors gathered in Tier 1, such as social determinants of health and prior health conditions, will be assessed to determine how they increase the risk of or protect against specific clinical outcomes (aim 3). Finally, Tier 3 data will be used to investigate long-term effects on multiple organ systems and child development (aim 4). Additionally, integration of Tier 1 and Tier 2 data will allow investigation of COVID-disease exposures and experiences which may be responsible for the clinical patterns observed in Tier 3.
The pediatric protocol was designed through collaboration across key stakeholders, including patients, caregivers, researchers, clinicians, community partners, and federal partners, fostering a patient-centered approach and promoting inclusivity and diversity. The pediatric protocol is adaptive to facilitate the changes needed in light of emerging science and the evolving pandemic.

Study organizational structure and management
Study infrastructure includes four cores: 1) Clinical Science Core (CSC) at the NYU Grossman School of Medicine, which oversees study sites and provides scientific leadership in collaboration with hub and site Principal Investigators; 2) Data Resource Core (DRC) at Massachusetts General Hospital and Brigham and Women's Hospital, which provides scientific and statistical leadership, and handles data management and storage; 3) PASC Biorepository Core (PBC) at Mayo Clinic, which manages biospecimens obtained; and 4) Administrative Coordinating Center (ACC) at RTI International, which provides operational and administrative support. Collectively, these form the Core Operations Group. The four cores are supported by oversight committees, and pathobiology task forces provide content-specific input. RECOVER cohort studies are overseen by the National Community Engagement Group (NCEG) composed of patient and community representatives, a Steering Committee composed of site Principal Investigators and NIH program leadership, an Executive Committee composed of NIH Institute leadership, and an Observational Safety Monitoring Board composed of experts in longitudinal observational studies, epidemiology, bioethics, and biostatistics. RECOVER-Pediatrics includes 10 hubs that manage ~100 sites (S1 Table), located in more than 39 states, Washington DC and Puerto Rico. Awardees were selected through a process that included independent peer review in response to OTA-21-015B.

Ethics
The study was approved by the NYU Grossman School of Medicine Institutional Review Board (IRB), which serves as the single IRB for the majority of the study sites. A few pre-existing networks use their own central IRBs through an exemption granted by the NIH (e.g., ABCD, MUSIC). Caregivers, for children 17 years old or younger, and young adult participants provide signed informed consent to participate.

Recruitment, consent, and screening strategies
The de novo RECOVER prospective cohort study is recruiting participants from healthcare- and community-based settings. Healthcare-based recruitment involves local media, text messaging, hospital websites, COVID registries, and partnerships with pediatric practices, nurse hotlines, or emergency departments. Community-based recruitment includes partnering with community health workers, school nurses, sports coaches, health fairs, and a mobile van to access rural communities. Participants can also join by self-referral through the RECOVER website, or in response to plain language and picture-based recruitment materials in both English and Spanish, which were developed with community input and using health literacy principles [31].
Eligible dyads complete an informed written consent process at enrollment for Tiers 1 and 2. The consent process may be conducted using telephone, a secure video conference platform approved for exchange of PHI, or in person (using either a signed written consent form or via electronic informed consent [e-consenting]). An assent process is being conducted for children between 7 years and 17 years old. The study team explains the assent document to the child and parent/legal guardian and answers all questions. Child understanding of the key elements of the assent document is assessed by the study team and parent/legal guardian. The child either signs the age-appropriate assent document or provides verbal assent (with documentation in the local records and the central REDCap). Young adults, aged 18 through 25 years old, sign their own informed consent. Tier 3 consent forms will only be completed when testing is offered. A standardized teach back method is implemented as needed to ensure understanding of the key aspects of participation before enrollment. Participants are reconsented if there are major changes to the study design or to anticipated risks.
In ABCD, 11,880 children aged 9-10 years old were recruited from community and school sites to participate in a 10-year study with the goal of understanding neurocognitive development during adolescence [26,27,32]. All ABCD participants are being contacted and offered enrollment into RECOVER-Pediatrics.

Eligibility criteria
Children and young adults from birth through 25 years old are eligible to be enrolled in the de novo cohort, regardless of history of SARS-CoV-2 infection. Enrolled participants are then categorized as either "infected" or "uninfected". Infected participants have a history of suspected, probable, or confirmed SARS-CoV-2 infection, defined by the World Health Organization (WHO) criteria [33], evidence of infection by serum antibody profile, or a history of MIS-C. Uninfected participants are those who self-report as having no history of a SARS-CoV-2 infection and who have never met WHO criteria; they have no evidence of a past asymptomatic infection in their medical history or evidence of past infection by serum antibody profile.
A primary caregiver, defined as an individual responsible for the enrolled child or young adult who resides in the same household, such as a biological or nonbiological family member, is invited to enroll.
The primary exclusion criterion is any child or young adult with co-morbid illness with expected survival of less than 2 years. There is no limit to the number of children or young adults who can be enrolled from a single household. See supplemental tables for detailed eligibility criteria, definitions of analytic groups, and the World Health Organization Criteria (S2-S4 Tables).

Study participants
Recruitment is striving for a diverse sample that generally represents the US population, and encourages participation from rural or medically underserved communities, non-English speaking participants, and non-hospitalized participants with an acute COVID-19 infection. Participants are compensated for completing assessments and reimbursed for excess travel.
At least 6,000 participants will be recruited into the de novo cohort (Fig 1). Children and young adults with history of infection are classified into one of two study arms (acute arm vs. post-acute arm), based on their history of SARS-CoV-2 infection and infection dates. The acute arm includes 800 children and young adults whose most recent SARS-CoV-2 infection was 30 days or less prior to enrollment. The post-acute arm includes 4,000 children and young adults whose most recent SARS-CoV-2 infection was greater than 30 days prior to enrollment. In the group without a history of infection, 1,200 children and young adults will be randomly assigned to follow either the acute (200, or 17%) or post-acute (1,000, or 83%) arm of the protocol. Additional children and young adults will be recruited from Long COVID clinics and other subspecialty services in order to achieve Tier 2 sample size targets (see Timing of Study Assessments).
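The enrollment targets above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the protocol: the dictionary keys and the `study_arm` helper are hypothetical names introduced here, while the counts and the 30-day threshold are taken from the text.

```python
# Target enrollment for the de novo cohort, as stated in the text.
# The key names are hypothetical labels chosen for this sketch.
targets = {
    "infected_acute": 800,          # most recent infection <= 30 days before enrollment
    "infected_post_acute": 4000,    # most recent infection > 30 days before enrollment
    "uninfected_acute": 200,        # no history of infection, randomized to acute arm
    "uninfected_post_acute": 1000,  # no history of infection, randomized to post-acute arm
}

total = sum(targets.values())
uninfected = targets["uninfected_acute"] + targets["uninfected_post_acute"]

# Randomization proportions within the uninfected group, rounded to whole percents.
acute_share = round(100 * targets["uninfected_acute"] / uninfected)
post_acute_share = round(100 * targets["uninfected_post_acute"] / uninfected)


def study_arm(days_since_infection: int) -> str:
    """Classify an infected participant by days since the most recent infection."""
    return "acute" if days_since_infection <= 30 else "post-acute"


print(total, acute_share, post_acute_share)
```

The four groups sum to the 6,000-participant target, and the 200/1,000 split of the uninfected group corresponds to the 17% and 83% proportions quoted above.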
Up to 10,000 participants will also be recruited from the ABCD cohort.

Timing of study assessments
The assessments for the de novo cohort consist of three tiers, which vary in timing, collection methods and intensity. Tier 1 (baseline visit for all participants) includes a single visit that is completed either via self-administration (remote and electronic) or research staff-assisted collection (e.g., telephone, videoconference, or in-person).
Tier 2 (follow-up visits) includes five longitudinal in-person visits at 2 to 6, 12, 24, 36, and 48 months post-enrollment. The children and young adults followed longitudinally in Tier 2 are selected based on a sampling scheme that prioritizes the acute arm as well as children and youth in the post-acute arm with a greater likelihood of having PASC. Promotion to Tier 2 occurs as follows: 1) All children/young adults in the acute arm with or without history of infection will be promoted; 2) children/young adults in the post-acute arm with a history of infection will be promoted at a rate dependent on their likelihood of PASC based on prior Long COVID diagnoses, Tier 1 PROMIS global health measure responses [34-36], and symptoms screener survey responses [18,37] (Table 1); and 3) 40% of children/young adults without known infection in the post-acute arm, selected at random, will be promoted. In addition to promoting children and young adults from Tier 1, children and young adults will also be recruited from Long COVID clinics and subspecialty services to achieve the target sample size in Tier 2 of 6,000. These children and young adults will complete both Tier 1 and Tier 2 assessments. See Table 2 for a full description of the promotion algorithm.
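The promotion rules can be summarized as a small lookup plus a random draw. The sketch below is illustrative only and is not the study's implementation: the function names, argument names, and string labels are assumptions made here, and the rates for the high, medium, and low probability groups (100%, 50%, and 20%, with the medium rate raised to 100% in June 2023) are those given in the Fig 1 footnote.

```python
import random


def promotion_rate(arm, infected, pasc_group=None, medium_rate=0.5):
    """Probability of promotion from Tier 1 to Tier 2 (hypothetical helper).

    arm: "acute" or "post-acute"
    infected: True if the participant has a history of infection
    pasc_group: "high", "medium", or "low" (post-acute infected only)
    medium_rate: 0.5 originally; 1.0 after the June 2023 change
    """
    if arm == "acute":
        return 1.0  # everyone in the acute arm is promoted
    if arm != "post-acute":
        raise ValueError(f"unknown arm: {arm!r}")
    if not infected:
        return 0.4  # 40% of post-acute uninfected, selected at random
    rates = {"high": 1.0, "medium": medium_rate, "low": 0.2}
    return rates[pasc_group]


def promote(arm, infected, pasc_group=None, medium_rate=0.5, rng=random):
    """Draw a random promotion decision for one participant."""
    return rng.random() < promotion_rate(arm, infected, pasc_group, medium_rate)


# Expected number of uninfected participants reaching Tier 2 under the targets
# in the text: 200 in the acute arm and 1,000 in the post-acute arm.
expected_uninfected_tier2 = round(
    200 * promotion_rate("acute", False) + 1000 * promotion_rate("post-acute", False)
)
print(expected_uninfected_tier2)
```

Under these assumptions the expected number of uninfected participants in Tier 2 is 600, which is consistent with a Tier 2 target of 6,000 that includes 5,400 participants with a history of infection.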
Children and young adults in the acute arm with a history of infection will also complete remote assessments at 2, 4, and 8 weeks after their infection onset, with additional in-person assessments at 8 weeks. Children and young adults in the acute arm without history of infection will complete the same assessments, timed relative to their enrollment date. All ABCD youth are eligible to participate in RECOVER Tier 1, and can be referred to a de novo cohort site to participate in Tiers 2 and 3, if geographically feasible.
Tier 3 has the most clinically intensive assessments with longitudinal in-person visits for a subset at 12 and 24 months post-enrollment. Tier 3 will include 600 children and young adults with history of infection from Tier 2.

Main categories of data
Data collected for the de novo and ABCD cohorts are described below (Table 3).
Surveys include validated surveys with NIH common data elements, as available, informed by expert opinion (S5 Table). All are completed using Research Electronic Data Capture (REDCap), with the child's first name coded within surveys to personalize the experience and to clarify which child the questions refer to, given that caregivers can have multiple children enrolled. For youth 17 years or younger, the caregiver is the primary respondent. Participants 18 through 25 years old are the primary respondents. Surveys assess sociodemographic information [38], child birth history [39], special health care needs [39-41], SARS-CoV-2 infection history, related conditions (e.g., MIS-C, POTS or other form of dysautonomia, and Long COVID diagnoses), COVID testing and vaccine history, COVID-related symptoms (both acute and long-term), COVID health consequences (e.g., diet [42], physical activity [42], sleep [42], screen time [42], schooling, parenting [43]) and social determinants of health (e.g., food insecurity [44], social support [45]). A list of potential Long COVID symptoms is assessed [18,37] (Table 1), with respondents asked whether a specific problem or symptom is/was present for at least 4 weeks since the beginning of the COVID-19 pandemic and, for respondents with a history of infection, if the symptoms started before or after their infection.
Clinical assessments are completed at in-person Tier 2 visits across overarching domains of physical growth, physical health, neurocognition, and neurobehavioral function (S6 Table). Physical health domains include anthropometrics, vital signs, an active standing test measuring orthostatic blood pressures [46,47], joint flexibility tests [48], electrocardiograms, and spirometry. Neurocognitive and neurobehavioral assessments vary by age (Table 4). Neurocognitive domains include broad and specific measures of attention, memory, receptive and expressive language skills, reading, and sensory function [49-53]. Neurobehavioral domains include a broad assessment of behavioral function including anxiety, mood, social interactions, aggression, sleep, self-regulatory behaviors, somatic complaints and attention concerns [54-61]. Tier 3 assessments follow the same domains, but provide more in-depth measurements. The promotion algorithm for Tier 3 is still under development. Physical health domains of cardio-pulmonary function are assessed by echocardiogram, cardiopulmonary exercise testing, cardiac MRI, pulmonary function tests, and sputum induction. Gastrointestinal function is assessed using abdominal ultrasound, and neurological function is assessed using brain MRI, electroencephalogram, and measures of neurocognitive function and psychiatric symptoms. These assessments include higher level measurement of all cognitive domains (thinking, language processing, memory, attention, and executive functioning) [62], visual motor integration and speed [63-65], and a psychiatric symptom battery [66].

Table 1 (continued; fragment recovered from the extracted text). Category label: Major. Symptom screener items, with table footnote markers in parentheses: having a sudden intense feeling of fear, like a panic attack (e); refusing to go to school (c); seeing, hearing, or feeling that something is there when it is not (hallucinations) (c); being hyperactive or much more active than other children (a); refusing to follow rules or doing what they are asked to do (a); serious breaking of rules like lying, stealing, starting fights, or bullying (a); having repeating memories, dreams, thoughts, or worries after a traumatic event (a).
Biospecimens are collected across all Tiers using kits designed specifically for each visit, timepoint, and participant age (Table 5; S6 Table). Tier 1 biospecimens consist of saliva and whole blood. Kits are shipped to homes for remote collection. Child and primary caregivers provide both saliva and blood; the other biological parent when available provides only saliva. Saliva is collected using Oragene devices (OGR-600) and banked for future DNA analysis. Whole blood is collected using a TASSO M20 device [67], which collects capillary blood using 4 volumetric sponges that each hold 17.5 μL of blood (70 μL total). One sponge is used for SARS-CoV-2 spike and nucleocapsid antibody testing and remaining sponges are banked for future use.
Tier 2 acute biospecimens include saliva (Oragene OGR-600) and whole blood collections. All post-acute Tier 2 biospecimens consist of whole blood. The maximum amount of blood drawn at a single visit is age dependent. Whole blood is collected using serum separator tubes (SST) and ethylenediaminetetraacetic acid (EDTA) tubes across all ages above 24 months, and an additional cell preparation tube (CPT) is included for participants 6 years of age and older.
Table footnote e: Symptoms included for children 12 years or older.

Statistical methods
We will estimate the proportion of children and young adults experiencing new onset or worsening of each symptom (incidence), stratified by age (0-5, 6-11, 12-17, 18-25 years), over time. Age stratifications were based on child developmental stages, including early childhood (birth to 5 years), school-age (6 to 11 years), adolescence (12 to 17 years) and young adulthood (18 to 25 years) [39]. Prevalence within the recruited population will be estimated as the point prevalence of each symptom, that is, the proportion of children and young adults who are currently experiencing each symptom at each study visit. The excess burden of

Table 2 (continued). Promotion to Tier 2 for participants in the "uninfected" group.
Study arm | Description | Promoted to Tier 2
Acute arm | At enrollment, 17% of the "uninfected" group were randomly assigned to participate in the acute arm of the study. | 100%
Post-acute arm | At enrollment, 83% of the "uninfected" group were randomly assigned to participate in the post-acute arm of the study. 40% of this group will be randomly assigned to participate in Tier 2. | 40%

a Responses to the PROMIS Global Health Scales and the presence of major and minor symptoms are used to categorize participants who are post-acute infected as high, medium, or low probability of PASC.
b The PROMIS Global Health Scales are self-reported or caregiver-reported measures of overall, physical, and mental health for young adults and children, respectively [34-36]. The three questions from the caregiver-reported version that are used in the algorithm include: 1) "In general, would you say your child's health is?"; 2) "In general, how would you rate your child's physical health?"; and 3) "In general, how would you rate your child's mental health, including mood and ability to think?" Responses include: Excellent, Very Good, Good, Fair, or Poor.
c A list of all major and minor symptoms, reported at the enrollment visit as part of the symptom screener, is provided in Table 1 [18,37]. Not all symptoms are asked of all participants, as many are age-specific (e.g., fewer symptoms assessed for younger children) and sex-specific (e.g., menses related symptoms).
https://doi.org/10.1371/journal.pone.0285635.t002
A preliminary working definition of PASC will be informed by using variable selection methods to identify which symptoms best differentiate children and young adults with and without an infection, following the methodology previously applied to develop a working definition of PASC in the RECOVER adult cohort [69]. Data from the Tier 1 visit will primarily be used. The estimated associations obtained from regression models will be used to define a PASC score, with a cutoff for PASC defined based on clinical expertise while ensuring that the rate of those with no history of infection who are diagnosed as having PASC is reasonably low. This preliminary symptom-based working definition is intended for research purposes and not clinical diagnosis. It will be modified and augmented by clinical and subclinical findings as they become available. The working definition of PASC that is developed will also be validated in RECOVER participants who have linked EHR data against other definitions derived from EHR-based cohorts [70]. While the working definition of PASC will be initially developed within each age stratum, depending on the overlap of key symptoms that are identified that define PASC, some age groups may be aggregated in the interest of developing a more unified definition of PASC. To identify PASC phenotypes among children and young adults who are classified as having PASC defined by symptom patterns, we will use unsupervised learning methods to discover symptom clusters within each age stratum (e.g., agglomerative hierarchical clustering [71] and consensus clustering [72]) to define PASC sub-phenotypes.
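The idea of a symptom-based score with a cutoff that keeps misclassification among uninfected participants low can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the study's method: the symptom names, weights, scores, and the 10% limit are all made up for illustration, and in the protocol the cutoff is set using clinical expertise rather than by this rule alone.

```python
def pasc_score(symptoms, weights):
    """Sum the weights of the symptoms that a participant reports."""
    return sum(w for name, w in weights.items() if symptoms.get(name, False))


def smallest_cutoff(uninfected_scores, max_false_positive_rate):
    """Smallest cutoff c such that the share of uninfected participants
    with score >= c does not exceed max_false_positive_rate."""
    n = len(uninfected_scores)
    # Candidate cutoffs in increasing order; the last one exceeds every score,
    # so it always yields a false positive rate of zero.
    candidates = sorted(set(uninfected_scores)) + [max(uninfected_scores) + 1]
    for c in candidates:
        false_positive_rate = sum(1 for s in uninfected_scores if s >= c) / n
        if false_positive_rate <= max_false_positive_rate:
            return c


# Hypothetical integer weights, standing in for rescaled regression coefficients.
weights = {"fatigue": 3, "trouble_sleeping": 2, "headache": 1}

# Hypothetical scores for ten participants with no history of infection.
uninfected_scores = [0, 0, 0, 0, 1, 1, 2, 3, 0, 0]

cutoff = smallest_cutoff(uninfected_scores, max_false_positive_rate=0.1)
score = pasc_score({"fatigue": True, "headache": True}, weights)
print(cutoff, score, score >= cutoff)
```

With these made-up inputs the cutoff is 3, so a participant reporting fatigue and headache (score 4) would be classified as meeting the working definition.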
With this definition of PASC, we will conduct regression analyses to evaluate whether the risk of PASC and PASC sub-types differs by multiple factors, including demographic, clinical, and caregiver characteristics, social determinants of health, SARS-CoV-2 infection and immunization history, symptom severity during the acute phase of SARS-CoV-2 infection, and therapeutic exposures. Logistic and Poisson regression will be used to evaluate the association (i.e., odds ratios and risk ratios) between pre-infection factors and PASC as a binary outcome, and multinomial regression will be used when PASC sub-types are used as categorical outcomes. Among participants in Tier 2 who develop PASC, we will use time-to-event analyses (i.e., Cox proportional hazards regression) to identify factors that influence time to recovery from PASC. To investigate biomarkers related to PASC, clinical laboratory assessments will be compared between children and young adults who do and do not develop PASC. Mediation analyses will also be used to study the pathways by which SARS-CoV-2 infection leads to the development of PASC. Since the trajectory of how PASC manifests may be affected by the presence of pre-existing conditions, we will study separately the pathophysiology of PASC in children and young adults with and without such conditions, when appropriate.
The study aims cover a wide range of scientific questions, but not all analyses will involve hypothesis testing. For instance, defining PASC does not require hypothesis testing, but evaluating whether the definition of or rates of PASC differ between age groups does. When multiple comparisons are made across age groups or other defined strata, or when different exposures and outcomes are assessed within the same subgroups, multiplicity adjustments for testing of non-exploratory hypotheses will be performed using the Hochberg procedure, in which tests of significance are performed in order of decreasing p-value with increasingly stringent thresholds [73]. This approach has been found to sacrifice less power in observational studies with correlated outcomes compared to the Bonferroni and other standard approaches for addressing multiplicity [74].
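The step-up logic of the Hochberg procedure can be illustrated with a minimal sketch. The function name `hochberg_reject`, the example p-values, and the 0.05 family-wise error rate are illustrative assumptions and are not taken from the protocol.

```python
def hochberg_reject(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Hochberg step-up procedure: the k-th largest p-value is compared with
    alpha / k. At the first p-value that passes, that hypothesis and all
    hypotheses with smaller p-values are rejected.
    """
    m = len(p_values)
    # Indices ordered from the largest to the smallest p-value.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    reject = [False] * m
    for rank, idx in enumerate(order):
        threshold = alpha / (rank + 1)  # thresholds tighten as p-values shrink
        if p_values[idx] <= threshold:
            for j in order[rank:]:
                reject[j] = True
            break
    return reject


# Hypothetical p-values for four non-exploratory comparisons.
example = [0.010, 0.020, 0.024, 0.300]
hochberg_result = hochberg_reject(example)
# A Bonferroni correction compares every p-value with alpha / m instead.
bonferroni_result = [p <= 0.05 / len(example) for p in example]
print(hochberg_result, bonferroni_result)
```

In this hypothetical example the second-largest p-value (0.024) falls below its threshold of 0.05/2, so three hypotheses are rejected, whereas a Bonferroni threshold of 0.05/4 rejects only one. This is consistent with the power advantage described above.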
Potential sources of missing data include item nonresponse and attrition. Multiple imputation by chained equations will be the primary approach used to handle item nonresponse [75]. Sensitivity analyses will include adjustments for potentially missing not at random data (i.e., informative missingness) using pattern mixture models, which permit the distribution of missing variables to differ between observed and unobserved values. Attrition, or missed visits, in the longitudinal phase of the study (Tier 2) will be addressed depending on the affected analysis. For time-to-event modeling, attrition induces censoring, which may not be independent if participants drop out of the study in a systematic fashion (e.g., participants with worse symptom trajectories may be less likely to continue to participate in the study). Inverse probability of censoring weights will be used to address dependent censoring in this context [76]. For analyses with repeated measures, multiple imputation alongside likelihood-based methods, which are valid under the missing at random assumption [77], will be used to handle missing data, with pattern mixture models used to perform sensitivity analyses for informatively missing data as appropriate [78].
Statistical analyses will primarily be conducted in R and SAS, though the statistical packages used by individual investigators for future analyses beyond those described in this manuscript will vary.

Power calculations
Power calculations for the de novo cohort were performed prior to recruitment using a type 1 error rate of 0.01 as a preliminary multiplicity adjustment. The actual statistical approach for addressing multiplicity will differ depending on the analysis and will involve more sophisticated methods (see Statistical methods), but these are not amenable to most power calculations. With 4,800 infected and 1,200 uninfected children and young adults from both acute and post-acute arms in Tier 1, as well as 10,000 children ages 12-17 from the ABCD cohort (assuming 3,500 are infected and 6,500 are uninfected), assuming the risk of a given symptom in the uninfected group is 10%, we have 90% power to detect a difference as small as 1.9% in the frequency of that symptom between groups.
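As a rough check of the Tier 1 figure, the sketch below computes power for a difference in two proportions using a two-sided normal approximation with unpooled variances. The approximation and the function name `power_two_proportions` are assumptions made for illustration; the protocol does not state which formula or software was used.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p0, diff, n0, n1, alpha=0.01):
    """Approximate power to detect p1 = p0 + diff with a two-sided z-test.

    p0, n0: symptom risk and sample size in the uninfected group.
    n1: sample size in the infected group.
    """
    p1 = p0 + diff
    # Standard error of the difference in proportions (unpooled variances).
    se = sqrt(p0 * (1 - p0) / n0 + p1 * (1 - p1) / n1)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # The negligible probability of rejecting in the wrong direction is ignored.
    return NormalDist().cdf(diff / se - z_crit)


# Sample sizes stated in the text: de novo Tier 1 plus the ABCD cohort.
n_infected = 4800 + 3500
n_uninfected = 1200 + 6500
power = power_two_proportions(p0=0.10, diff=0.019,
                              n0=n_uninfected, n1=n_infected)
print(f"approximate power: {power:.3f}")
```

Under these assumptions the computed power is close to the stated 90% for a 1.9 percentage point difference at a 10% baseline risk.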
In Tier 2, given the sampling and promotion framework described in Timing of Study Assessments, our sample with longitudinal follow-up will be skewed towards those who are more likely to have PASC. Following development of a definition of PASC, we consider the scenario in which, of the 5,400 children and young adults with a history of infection in Tier 2, 3,600 meet PASC criteria and 1,800 do not. For a hypothetical risk factor with 50% prevalence in the PASC-negative group, we have 90% power to detect an odds ratio as small as 1.25 for the odds of PASC for those with the risk factor versus those without. For a factor with 25% prevalence in the PASC-negative group, the minimum detectable odds ratio is 1.28. In our Tier 3 sample of 600 children and young adults with a history of infection (which includes additional data on biomarkers), assuming the sample has 400 with PASC and 200 without PASC, for a marker with 10% prevalence in the PASC-negative group, we have 90% power to detect an odds ratio as small as 2.60 for PASC.
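The minimum detectable odds ratios for Tier 2 can be approximated with a short sketch. It assumes that the stated prevalence refers to participants without PASC, uses a two-sided normal approximation for the difference in exposure prevalence, and searches for the smallest odds ratio by bisection. The function names and the approximation are illustrative assumptions, not the authors' stated method, and exact or continuity-corrected methods can give somewhat different values, particularly in smaller samples.

```python
from math import sqrt
from statistics import NormalDist


def power_odds_ratio(odds_ratio, prev_no_pasc, n_pasc, n_no_pasc, alpha=0.01):
    """Approximate power to detect an odds ratio for a binary risk factor."""
    # Convert the odds ratio into the implied prevalence among those with PASC.
    odds_pasc = odds_ratio * prev_no_pasc / (1 - prev_no_pasc)
    prev_pasc = odds_pasc / (1 + odds_pasc)
    se = sqrt(prev_no_pasc * (1 - prev_no_pasc) / n_no_pasc
              + prev_pasc * (1 - prev_pasc) / n_pasc)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf((prev_pasc - prev_no_pasc) / se - z_crit)


def min_detectable_or(prev_no_pasc, n_pasc, n_no_pasc,
                      target_power=0.90, alpha=0.01):
    """Smallest odds ratio above 1 that reaches the target power (bisection)."""
    lo, hi = 1.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if power_odds_ratio(mid, prev_no_pasc, n_pasc, n_no_pasc, alpha) < target_power:
            lo = mid
        else:
            hi = mid
    return hi


# Tier 2 scenario from the text: 3,600 with PASC and 1,800 without.
or_prev50 = min_detectable_or(0.50, n_pasc=3600, n_no_pasc=1800)
or_prev25 = min_detectable_or(0.25, n_pasc=3600, n_no_pasc=1800)
print(f"{or_prev50:.2f} {or_prev25:.2f}")
```

Under these assumptions both Tier 2 values land close to the odds ratios of 1.25 and 1.28 reported above.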
Given that many analyses will be stratified by age group, we calculate minimum detectable effect sizes overall and within each age stratum in S7 Table. Estimates of the distribution of ages are based on early enrollment data, with 26% of main cohort participants in the age 0-5 years category, 28% in ages 6-11 years, 26% in ages 12-17 years, and 20% in ages 18-25 years.

Discussion
The overall goal of RECOVER-Pediatrics is to improve our understanding of recovery after SARS-CoV-2 infection, with a focus on the prevalence, natural history, and pathogenesis of PASC in children and young adults. Successful completion should lead to formal characterization of pediatric PASC as its own syndrome. This is essential to develop diagnostic, treatment, and preventive strategies tailored to children's unique physiology.
RECOVER-Pediatrics is well positioned to ascertain the epidemiology, four-year clinical course, and sociodemographic contributions to pediatric PASC, with rich data and biosamples available to readily test further mechanistic hypotheses, establish biomarkers, and provide insights into potential therapies. The meta-cohort is designed to provide details that are not available in other large epidemiologic or electronic health records queries, including a dynamic study design that can be flexible and responsive as new variants arise, and as our understanding of the long-term effects of SARS-CoV-2 evolves. RECOVER-Pediatrics was designed to include a wide range of ages, and diverse socioeconomic, racial, ethnic and geographic populations to ensure that findings are generalizable, and provide equitable benefit for all.
The generation-defining nature of the COVID-19 pandemic will impact the life course of children in ways that we have yet to fully understand. The unprecedented scope of RECOVER-Pediatrics sets the stage for not only characterizing a new disorder that will impact children for years to come, but also for identifying and deploying solutions through its collaborations with investigators and communities across the country.
RECOVER-Pediatrics is expected to gather a rich data set that can be used to develop treatments for persons with Long COVID and to provide guidelines for responding more quickly to future coronavirus outbreaks, which are likely to emerge, so as to prevent them, reduce their consequences, and treat their complications.

Fig 1. Overview of RECOVER-Pediatrics (de novo and ABCD cohorts).
https://doi.org/10.1371/journal.pone.0285635.g001

Table 1.*

  Muscle weakness
  Pains in the joints (like the elbows, knees, ankles) a
  Pain in the back a
  Pain in the neck a
Minor
  Sore muscles or pain in the muscles a
  Body aches or pains

Symptoms or problems involving the brain and nerves
Major
  Headache c
  Feeling dizzy (feeling like the room is spinning) c
  Shakiness or tremors c
  Feeling tingling or 'pin-and-needles' in the hands and feet c
  Unable to move part of the body
  Problems with remembering things (memory) a
  Problems with focusing on things (concentration), sometimes called "brain fog" a
  Problems with talking a

Symptoms or problems involving feelings or behavior
Major
  Feeling sad or depressed c
  Feeling anxious or on edge c
  Feeling a lot of fear when being away from parent or caregiver b
  Feeling a lot of fear of specific things like spiders or being up high d
  Feeling a lot of fear about being with other children or adults d
  Feeling fear of crowds or being in closed-in spaces c

  of tantrums b
  Holding their breath for a long time when they are afraid or angry b
  Having nightmares
  Screaming in fear while asleep, sometimes called night terrors b
  Aggressive behavior like hitting, biting or kicking b
  Rocking the body back and forth or head banging

* The following question is used to assess potential PASC symptoms in RECOVER-Pediatrics: "Did your child have any of these problems or symptoms lasting for more than 4 weeks that started or got worse since the COVID pandemic began in March 2020? These are problems or symptoms that kept happening without stopping or kept happening again and again for longer than 4 weeks."
** Symptoms were classified as major or minor depending on their severity and presumed likelihood of being associated with a COVID-19 infection.
a Symptoms included for children 3 years or older.
b Symptom included for children 5 years or younger.
c Symptom included for children 6 years or older.
d Symptoms included for children 2 years or older.

Table 2. Promotion algorithm used in the de novo RECOVER-Pediatrics cohort for selecting children and young adults for the longitudinal follow-up (Tier 2).

≥1 fair/poor responses on the global health PROMIS scale b and ≥1 major or minor symptom reported c
4. ≥1 good/fair/poor responses on the global health PROMIS scale b and ≥1 major symptom reported c
5. ≥1 good/fair/poor responses on the global health PROMIS scale b and ≥2 minor symptoms reported c

≥1 good responses on the global health PROMIS scale b and ≥1 major or minor symptom reported c
2. ≥1 very good/good/fair/poor responses on the global health PROMIS scale b and ≥1 major symptom reported c
3. ≥1 very good/good/fair/poor responses on the global health PROMIS scale b and ≥2 minor symptoms reported c

Table 5. (Continued)
The SST tube is collected and, within 4 hours of collection, centrifuged; serum is aliquoted and frozen locally at collection sites. Serum aliquots are batch shipped frozen on dry ice at monthly intervals and are banked for future research. The EDTA tube is collected for all age groups and is processed for plasma, WBC, and RBC aliquots. A plasma aliquot is sent out for central testing. The other EDTA aliquot derivatives are frozen and banked for future research. The CPT tubes are only collected for age groups 6-25 years. The CPT tubes are centrifuged at collection sites and sent on ice packs to the PBC on the day of collection. On arrival at the PBC, the CPT tubes are processed. A maximum of 8 x 1 mL PBMC aliquots (minimum of 5 million cells/mL) are derived. PBMC aliquots are stored in liquid nitrogen and banked for future research.
b c d e Tier 3 biospecimen parameters are currently under development, but will involve collection of whole blood for clinical chemistry and biobanking and the collection and banking of biospecimens for microbiome analysis.
https://doi.org/10.1371/journal.pone.0285635.t005